Enhancing Generic Pipeline Model for Code Clone Detection using Divide and Conquer Approach

نویسندگان

  • Al-Fahim Mubarak-Ali
  • Sharifah Mashita Syed-Mohamad
  • Shahida Sulaiman
چکیده

Code clone is known as identical copies of the same instances or fragments of source codes in software. Current code clone research focuses on the detection and analysis of code clones in order to help software developers identify code clones in source codes and reuse the source codes in order to decrease the maintenance cost. Many approaches such as textual based comparison approach, token based comparison and tree based comparison approach have been used to detect code clones. As software grows and becomes a legacy system, the complexity of these approaches in detecting code clones increases. Thus, this scenario makes it more difficult to detect code clones. Generic pipeline model is the most recent code clone detection that comprises five processes which are parsing process, pre-processing process, pooling process, comparing processes and filtering process to detect code clone. This research highlights the enhancement of the generic pipeline model using divide and conquer approach that involves concatenation process. The aim of this approach is to produce a better input for the generic pipeline model by processing smaller part of source code files before focusing on the large chunk of source codes in a single pipeline. We implement and apply the proposed approach with the support of a tool called Java Code Clone Detector (JCCD). The result obtained shows an improvement in the rate of code clone detection and overall runtime performance as compared to the existing generic pipeline model.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Free Vibration Analysis of Repetitive Structures using Decomposition, and Divide-Conquer Methods

This paper consists of three sections. In the first section an efficient method is used for decomposition of the canonical matrices associated with repetitive structures. to this end, cylindrical coordinate system, as well as a special numbering scheme were employed. In the second section, divide and conquer method have been used for eigensolution of these structures, where the matrices are in ...

متن کامل

INVERSE FREQUENCY RESPONSE ANALYSIS FOR PIPELINES LEAK DETECTION USING THE PARTICLE SWARM OPTIMIZATION

Inverse Transient Analysis (ITA) is a powerful approach for leak detection of pipelines. When the pipe transient flow is analyzed in frequency domain the ITA is called Inverse Frequency Response Analysis (IFRA). To implement an IFRA for leak detection, a transient state is initiated in the pipe by fast closure of the downstream end valve. Then, the pressure time history at the valve location is...

متن کامل

SVP { a Model Capturing Sets , Streams

We describe the SVP data model. The goal of SVP is to model both set and stream data, and to model parallelism in bulk data processing. SVP also shows promise for other parallel processing applications. SVP models collections, which include sets and streams as special cases. Collections are represented as ordered tree structures, and divide-and-conquer mappings are easily deened on these struct...

متن کامل

Comparing MapReduce and Pipeline Implementations for Counting Triangles

A common method to define a parallel solution for a computational problem consists in finding a way to use the Divide & Conquer paradigm in order to have processors acting on its own data and scheduled in a parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the...

متن کامل

Preliminary Result of Parallel Double Divide and Conquer

This paper shows a concept for parallelization of double Divide and Conquer and its preliminary result. For singular value decomposition, double Divide and Conquer was recently proposed. It first computes singular values by a compact version of Divide and Conquer. The corresponding singular vectors are then computed by twisted factorization. The speed and accuracy of double Divide and Conquer a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Int. Arab J. Inf. Technol.

دوره 12  شماره 

صفحات  -

تاریخ انتشار 2015